HBASE-28836 Parallize the file archival to improve the split times #6616

Open
wants to merge 1 commit into
base: master

mnpoonia
Contributor

No description provided.

@mnpoonia
Contributor Author

@virajjasani @stoty I am giving this another try and hope it doesn't fail this time. The tests are still running locally, but I am optimistic. 🤞🏾

@stoty
Contributor

stoty commented Jan 21, 2025

I still see some failures that are not the usual flakies.

@mnpoonia mnpoonia force-pushed the parallel_HBASE-28836 branch from c9022c1 to 4bd65e7 Compare January 21, 2025 07:51
@mnpoonia mnpoonia force-pushed the parallel_HBASE-28836 branch from 4bd65e7 to d3b481d Compare January 21, 2025 08:06
@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 26s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 2m 46s | | master passed |
| +1 💚 | compile | 2m 55s | | master passed |
| +1 💚 | checkstyle | 0m 33s | | master passed |
| +1 💚 | spotbugs | 1m 26s | | master passed |
| +1 💚 | spotless | 0m 39s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 2m 44s | | the patch passed |
| +1 💚 | compile | 2m 54s | | the patch passed |
| +1 💚 | javac | 2m 54s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 35s | | the patch passed |
| +1 💚 | spotbugs | 1m 33s | | the patch passed |
| +1 💚 | hadoopcheck | 10m 31s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 41s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 9s | | The patch does not generate ASF License warnings. |
| | | 34m 32s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6616/3/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6616 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 7d0a5d7478fb 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / d3b481d |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 85 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6616/3/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@mnpoonia
Contributor Author

@stoty I created the same PR against the 2.5 branch (#6615), and I don't see any failures there. I will check the failures on this PR and try them locally once the current build finishes.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 26s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 3m 1s | | master passed |
| +1 💚 | compile | 0m 52s | | master passed |
| +1 💚 | javadoc | 0m 26s | | master passed |
| +1 💚 | shadedjars | 5m 36s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 2m 48s | | the patch passed |
| +1 💚 | compile | 0m 53s | | the patch passed |
| +1 💚 | javac | 0m 53s | | the patch passed |
| +1 💚 | javadoc | 0m 26s | | the patch passed |
| +1 💚 | shadedjars | 5m 35s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 207m 44s | | hbase-server in the patch passed. |
| | | 232m 9s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6616/3/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #6616 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 5627eb7b9770 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / d3b481d |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6616/3/testReport/ |
| Max. process+thread count | 5714 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6616/3/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@mnpoonia
Contributor Author

@stoty No test failures this time. Please have a look, and let me know if I am missing something.

```java
    Queue<File> failures, String startTime) {
    LOG.trace("Archiving {} files concurrently into directory: {}", files.size(), baseArchiveDir);

    ExecutorService executorService = Executors.newCachedThreadPool();
```
Contributor

Most similar pools are configurable.

Have you considered making the thread pool configurable?
Would it make sense to use a global pool here and limit the number of concurrent move operations?
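
For illustration, a minimal sketch of what a shared, size-limited pool could look like. This is not the PR's code: `SharedArchivePoolSketch`, the key `example.archiver.max.threads`, and the default of 8 are hypothetical names and values, and `java.util.Properties` stands in for the HBase configuration object so the example stays self-contained.

```java
import java.util.Properties;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class SharedArchivePoolSketch {
  // Hypothetical key and default; neither comes from the PR or from HBase.
  static final String MAX_THREADS_KEY = "example.archiver.max.threads";
  static final int DEFAULT_MAX_THREADS = 8;

  private static ThreadPoolExecutor shared;

  /** Reads the thread cap from the (stand-in) configuration, never below 1. */
  static int maxThreads(Properties conf) {
    String raw = conf.getProperty(MAX_THREADS_KEY, String.valueOf(DEFAULT_MAX_THREADS));
    // parseInt throws NumberFormatException on malformed input; not handled here.
    return Math.max(1, Integer.parseInt(raw));
  }

  /**
   * Returns one process-wide pool. The size is fixed by the first caller's
   * configuration, so concurrent archive calls share the same thread cap.
   */
  static synchronized ThreadPoolExecutor getPool(Properties conf) {
    if (shared == null) {
      int n = maxThreads(conf);
      shared = new ThreadPoolExecutor(n, n, 30L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>(), r -> {
          // Daemon threads so an idle pool never blocks JVM shutdown.
          Thread t = new Thread(r, "archive-sketch");
          t.setDaemon(true);
          return t;
        });
      // Let idle threads exit after the keep-alive instead of lingering.
      shared.allowCoreThreadTimeOut(true);
    }
    return shared;
  }
}
```

With a pool like this, every archive call would submit its move tasks to `getPool(conf)`, so the total number of in-flight moves stays at or below the configured cap regardless of how many calls run at once.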

Contributor Author

Considering that we shut down the threads after execution, would it be okay to use a sensible constant rather than a configuration? In my opinion, one more configuration would not help us. I do understand that a maximum cap on the number of threads is important.
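
A rough sketch of the constant-cap variant described above, under stated assumptions: `BoundedArchiveSketch`, `MAX_ARCHIVE_THREADS`, its value of 8, and the `archiveOne` predicate are all hypothetical and stand in for the PR's actual per-file move logic. The pool is created per call, bounded by the constant, and shut down once every task has finished.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

final class BoundedArchiveSketch {
  // Hypothetical cap; the discussion does not settle on a value.
  static final int MAX_ARCHIVE_THREADS = 8;

  /**
   * Applies archiveOne to every file on at most MAX_ARCHIVE_THREADS threads.
   * Returns the files for which archiveOne returned false or threw.
   */
  static <T> Queue<T> archiveAll(List<T> files, Predicate<T> archiveOne) {
    Queue<T> failures = new ArrayDeque<>();
    if (files.isEmpty()) {
      return failures;
    }
    // Unlike newCachedThreadPool, a fixed pool never exceeds the cap.
    ExecutorService pool =
      Executors.newFixedThreadPool(Math.min(MAX_ARCHIVE_THREADS, files.size()));
    try {
      List<Future<Boolean>> pending = new ArrayList<>();
      for (T file : files) {
        Callable<Boolean> task = () -> archiveOne.test(file);
        pending.add(pool.submit(task));
      }
      // pending.get(i) belongs to files.get(i), so failures map back by index.
      for (int i = 0; i < pending.size(); i++) {
        boolean ok;
        try {
          ok = pending.get(i).get();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt status
          ok = false;
        } catch (ExecutionException e) {
          ok = false; // the task threw; treat the file as failed
        }
        if (!ok) {
          failures.add(files.get(i));
        }
      }
    } finally {
      pool.shutdown(); // threads are released after each call
    }
    return failures;
  }
}
```

Note that a per-call pool only bounds threads within one call: if several archive calls run at the same time, each gets its own pool, so the total thread count can be a multiple of the cap.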

Contributor

TBH I don't have a lot of experience with object storage deletion performance.

Are resolveAndArchive calls serial, or is it possible to have multiple invocations running at the same time?

What do you think, @wchevreuil, @BukrosSzabolcs?
